{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Copyright 2020 Google LLC\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "#     https://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a target=\"_blank\" href=\"https://colab.research.google.com/github/GoogleCloudPlatform/keras-idiomatic-programmer/blob/master/books/deep-learning-design-patterns/Workshops/Novice/Deep%20Learning%20Design%20Patterns%20-%20Workshop%20-%20Chapter%20III%20-%201.ipynb\">\n",
    "<img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deep Learning Design Patterns - Code Labs\n",
    "\n",
    "## Lab Exercise #4 - Get Familiar with Data Preprocessing - Answers\n",
    "\n",
    "## Prerequisites:\n",
    "\n",
    "    1. Familiar with Python\n",
    "    2. Completed Chapter III: Training Foundation\n",
    "\n",
    "## Objectives:\n",
    "\n",
    "    1. Reading and Resizing Images\n",
    "    2. Assembling images into datasets\n",
    "    3. Setting the datatype\n",
    "    4. Normalizing/Standardizing images"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup:\n",
    "\n",
    "Install the OpenCV package, then import it together with NumPy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collecting opencv-python\n",
      "  Downloading opencv_python-4.4.0.44-cp37-cp37m-manylinux2014_x86_64.whl (49.5 MB)\n",
      "\u001b[K     |████████████████████████████████| 49.5 MB 57 kB/s  eta 0:00:01\n",
      "\u001b[?25hRequirement already satisfied, skipping upgrade: numpy>=1.14.5 in /opt/conda/lib/python3.7/site-packages (from opencv-python) (1.18.5)\n",
      "Installing collected packages: opencv-python\n",
      "  Attempting uninstall: opencv-python\n",
      "    Found existing installation: opencv-python 4.2.0.34\n",
      "    Uninstalling opencv-python-4.2.0.34:\n",
      "      Successfully uninstalled opencv-python-4.2.0.34\n",
      "Successfully installed opencv-python-4.4.0.44\n",
      "\u001b[33mWARNING: You are using pip version 20.1.1; however, version 20.2.3 is available.\n",
      "You should consider upgrading via the '/opt/conda/bin/python3.7 -m pip install --upgrade pip' command.\u001b[0m\n",
      "4.4.0\n"
     ]
    }
   ],
   "source": [
    "# Install OpenCV computer vision package\n",
    "!pip install -U opencv-python\n",
    "\n",
    "# Import OpenCV \n",
    "import cv2\n",
    "# We will also be using the numpy package in this code lab\n",
    "import numpy as np\n",
    "\n",
    "print(cv2.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reading/Resizing Images\n",
    "\n",
    "Let's read in an image and then resize it to 128 x 128, the input shape we will use for a CNN.\n",
    "\n",
    "Fill in the blanks (replace the ??), make sure the code passes the Python interpreter, and then verify its correctness against the summary output.\n",
    "\n",
    "You will need to:\n",
    "\n",
    "    1. Set the parameter for reading an image in as color (three channels, which OpenCV stores in BGR order)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(584, 612, 3)\n"
     ]
    }
   ],
   "source": [
    "from urllib.request import urlopen\n",
    "\n",
    "# Let's read in the image as a color image\n",
    "url = \"https://github.com/GoogleCloudPlatform/keras-idiomatic-programmer/blob/master/books/deep-learning-design-patterns/Workshops/Novice/apple.jpg?raw=true\"\n",
    "request = urlopen(url)\n",
    "img_array = np.asarray(bytearray(request.read()), dtype=np.uint8)\n",
    "# HINT: the parameter value for a color image\n",
    "image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)\n",
    "\n",
    "# Let's verify we read it in correctly. We should see (584, 612, 3)\n",
    "print(image.shape)"
   ]
  },
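  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A side note on channel order: `cv2.imdecode` with `cv2.IMREAD_COLOR` returns the channels in BGR order, not RGB. The sketch below is not part of the exercise; it uses a made-up one-pixel array (`bgr`) as a stand-in for the decoded image and shows how `cv2.cvtColor` could reorder the channels for libraries that expect RGB.\n",
    "\n",
    "```python\n",
    "import cv2\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical stand-in for a decoded image: one pure-blue pixel in BGR order\n",
    "bgr = np.zeros((1, 1, 3), dtype=np.uint8)\n",
    "bgr[0, 0] = (255, 0, 0)\n",
    "\n",
    "# Reorder the channels from BGR to RGB\n",
    "rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)\n",
    "\n",
    "# Blue sits in the first channel before the conversion and in the last one after it\n",
    "print(bgr[0, 0].tolist(), rgb[0, 0].tolist())  # [255, 0, 0] [0, 0, 255]\n",
    "```\n",
    "\n",
    "The shape is unchanged by the conversion, so the rest of the lab works the same with either channel order."
   ]
  },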
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Resize it to 128 x 128\n",
    "\n",
    "Okay, we see that the image is 584 (height) x 612 (width). Hmm, that's not square. We could simply resize it to 128 x 128, but then the image would be skewed: the original height and width differ, so forcing both to the same length distorts the aspect ratio.\n",
    "\n",
    "So, let's pad the image into a square frame and then resize it.\n",
    "\n",
    "You will need to:\n",
    "\n",
    "    1. Set the padding for the top and bottom."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pad size 28\n",
      "padded image (612, 612, 3)\n"
     ]
    }
   ],
   "source": [
    "# Let's calculate the difference between width and height -- this should output 28\n",
    "pad = (612 - 584)\n",
    "print(\"pad size\", pad)\n",
    "\n",
    "# Split the padding evenly between the top and bottom\n",
    "# HINT: even means half.\n",
    "top = pad // 2\n",
    "bottom = pad // 2\n",
    "left = 0\n",
    "right = 0\n",
    "\n",
    "# Let's now make a copy of the image with the padded border.\n",
    "# cv2.BORDER_CONSTANT means use a constant value for the padded border.\n",
    "# [0, 0, 0] is the constant value (all black pixels)\n",
    "color = [0, 0, 0]\n",
    "image = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, \n",
    "                               value=color)\n",
    "\n",
    "# This should output (612, 612, 3)\n",
    "print(\"padded image\", image.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "resized image (128, 128, 3)\n"
     ]
    }
   ],
   "source": [
    "# Let's resize the image now to 128 x 128\n",
    "# HINT: The tuple parameter is (width, height)\n",
    "image = cv2.resize(image, (128, 128))\n",
    "\n",
    "# This should output (128, 128, 3)\n",
    "print(\"resized image\", image.shape)"
   ]
  },
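  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The padding step above hard-codes the 612 and 584 of this particular image. As a rough sketch of how the same idea could be generalized, the helper below reads the dimensions from the array itself and pads whichever side is shorter. The name `pad_to_square` and the all-black `dummy` array are assumptions made for illustration; they are not part of the lab.\n",
    "\n",
    "```python\n",
    "import cv2\n",
    "import numpy as np\n",
    "\n",
    "def pad_to_square(img, color=(0, 0, 0)):\n",
    "    # Hypothetical helper: pad the shorter side so that height == width\n",
    "    height, width = img.shape[:2]\n",
    "    diff = abs(width - height)\n",
    "    # Split the padding as evenly as possible; an odd leftover pixel goes to the second side\n",
    "    first, second = diff // 2, diff - diff // 2\n",
    "    if height < width:\n",
    "        top, bottom, left, right = first, second, 0, 0\n",
    "    else:\n",
    "        top, bottom, left, right = 0, 0, first, second\n",
    "    return cv2.copyMakeBorder(img, top, bottom, left, right,\n",
    "                              cv2.BORDER_CONSTANT, value=color)\n",
    "\n",
    "# Stand-in with the same shape as the apple image\n",
    "dummy = np.zeros((584, 612, 3), dtype=np.uint8)\n",
    "print(pad_to_square(dummy).shape)  # (612, 612, 3)\n",
    "```\n",
    "\n",
    "Because the padding is derived from `img.shape`, the same call would also handle portrait images, where the border is added on the left and right instead."
   ]
  },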
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Assembling into a Dataset\n",
    "\n",
    "Let's read in a group of images, resize them to the same size, and assemble them into a dataset (i.e., a single numpy multidimensional array).\n",
    "\n",
    "You will need to:\n",
    "\n",
    "    1. Specify the numpy method to convert a list to a numpy array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's build a dataset of four images. We will start by using a list to append each image\n",
    "# as it is read in.\n",
    "images = []\n",
    "for _ in range(4):\n",
    "    # Let's pretend we are reading in different images and resizing them,\n",
    "    # but instead we will just reuse our image from above.\n",
    "    images.append(image)\n",
    "\n",
    "# convert the list of images to a numpy multidimensional array\n",
    "# HINT: use the method that converts list to numpy array\n",
    "images = np.asarray(images)\n",
    "\n",
    "# This should output (4, 128, 128, 3), where the 4 indicates this is a batch of 4 images.\n",
    "print(\"dataset\", images.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting the Datatype\n",
    "\n",
    "Next, we will set the data type of the pixel data to single-precision floating point (float32), which uses 32 bits (4 bytes) per value.\n",
    "\n",
    "You will need to:\n",
    "\n",
    "    1. Specify the data type for a 32-bit float."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set the datatype to single precision float (FLOAT32)\n",
    "# HINT: It is lowercased.\n",
    "images = images.astype(np.float32)\n",
    "\n",
    "# This should output: 4 (itemsize is the size in bytes of one array element, i.e., one channel value)\n",
    "print(\"bytes per pixel value\", images.itemsize)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalizing/Standardizing the Pixel Data\n",
    "\n",
    "Finally, we will standardize the pixel data:\n",
    "\n",
    "    1. Calculate the mean and standard deviation using numpy methods\n",
    "    2. Subtract the mean from images and then divide by the standard deviation\n",
    " \n",
    "You will need to:\n",
    "\n",
    "    1. Divide by the standard deviation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Calculate the mean value across all the pixels\n",
    "mean = np.mean(images)\n",
    "# Calculate the corresponding standard deviation\n",
    "std  = np.std(images)\n",
    "\n",
    "# Subtract the mean and divide by the standard deviation\n",
    "# HINT: you calculated the standard deviation above.\n",
    "n_images = (images - mean) / std\n",
    "\n",
    "# Let's print the before and after values:\n",
    "# You should see: 177.9258 3.1789154e-07\n",
    "print(mean, np.mean(n_images))"
   ]
  },
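  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The section title mentions both normalizing and standardizing, but the exercise only covers standardization. For comparison, here is a minimal sketch of both on a randomly generated stand-in batch. The names `batch`, `normalized`, and `standardized` are made up for this illustration, and dividing by 255 assumes 8-bit pixel values.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical stand-in batch: 4 random 128 x 128 images with 3 channels of 8-bit values\n",
    "rng = np.random.default_rng(0)\n",
    "batch = rng.integers(0, 256, size=(4, 128, 128, 3)).astype(np.float32)\n",
    "\n",
    "# Normalization: rescale the pixel values from [0, 255] to [0, 1]\n",
    "normalized = batch / 255.0\n",
    "\n",
    "# Standardization: shift to zero mean and scale to unit standard deviation\n",
    "standardized = (batch - batch.mean()) / batch.std()\n",
    "\n",
    "print(normalized.min(), normalized.max())\n",
    "print(standardized.mean(), standardized.std())\n",
    "```\n",
    "\n",
    "Normalized values stay within [0, 1], while standardized values are centered near 0 with a standard deviation near 1 (up to float32 rounding)."
   ]
  },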
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## End of Lab Exercise"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
